Factors associated with diagnosis of stages I and II lung cancer: a multivariate analysis

ABSTRACT OBJECTIVE To present the overall survival rate for lung cancer and identify the factors associated with early diagnosis of stage I and II lung cancer. METHODS This is a retrospective cohort study including individuals diagnosed with lung cancer, from January 2009 to December 2017, according to the cancer registry at UMass Memorial Medical Center. Five-year overall survival and its associated factors were identified by Kaplan–Meier curves and Cox’s proportional hazards model. Factors associated with diagnosing clinical stage I and II lung cancer were identified by bivariate and multivariate backward stepwise logistic regression (Log-likelihood ratio (LR)) at 95% confidence interval (CI). RESULTS The study was conducted with data on 2730 individuals aged 67.9 years on average, 51.5% of whom female, 92.3% white, and 6.6% never smoked. Five-year overall survival was 21%. Individuals diagnosed with early-stage disease had a 43% five-year survival rate compared to 8% for those diagnosed at late stages. Stage at diagnosis was the main factor associated with overall survival [HR = 4.08 (95%CI: 3.62–4.59)]. Factors associated with early diagnosis included patients older than 68 years [OR = 1.23 (95%CI: 1.04–1.45)], of the female gender [OR = 1.47 (95%CI: 1.24–1.73)], white [OR = 1.63 (95%CI: 1.16–2.30)], and never-smokers [OR = 1.37 (95%CI: 1.01–1.86)]; as well as tumors affecting the upper lobe [OR = 1.46 (95%CI: 1.24–1.73)]; adenocarcinoma [OR = 1.43 (95%CI: 1.21–1.69)]; and diagnosis after 2014 [OR = 1.61 (95%CI: 1.37–1.90)]. CONCLUSIONS Stage at diagnosis was the most decisive predictor for survival. Non-white and male individuals were more likely to be diagnosed at a late stage. Thus, promoting lung cancer early diagnosis by improving access to health care is vital to enhance overall survival for individuals with lung cancer.


INTRODUCTION
Lung cancer is the leading cause of cancer death worldwide for both men and women. In 2020, lung cancer accounted for the second most common cancer around the world, with 2.21 million cases and 1.80 million deaths 1 . From 2009-2013, deaths due to this cancer in the United States surpassed the records for breast, prostate, colorectal, and liver cancer combined. Moreover, authorities estimate that 131,880 deaths from this disease will occur in the country in 2021 2 .
Stage at diagnosis is the most decisive factor for lung cancer survival. The relative five-year survival rate for localized, stage I, non-small cell lung cancer (NSCLC) is approximately 57.4%, compared to 5% for distant metastases. Surgical resection is the most effective treatment for NSCLC; however, approximately 40% of NSCLC patients are diagnosed at stage 4 3 .
Ethnic disparities exert significant influence on lung cancer diagnosis and treatment. Studies found higher mortality and incidence rates of the disease among African-Americans and non-Hispanics 2,3 . Moreover, Hispanics with stage I lung cancer had lower survival rates when compared with white individuals, besides presenting lower rates of lung resection and higher percentages of late-stage diagnosis [4][5][6] . Among other factors, such differences could be explained by the various cultural factors influencing the perception of minoritized groups regarding healthcare, including negative surgical beliefs and mistrust towards the healthcare system and providers 4 .
Individuals with low socioeconomic status and living in resource-deprived areas are at greater disadvantage when seeking timely treatment for lung cancer, which is partly attributed to the time taken to travel to these services. These patients often take longer to attain a proper and timely histological diagnosis and further treatment, which has a negative impact on their survival 7,8 .
For most cancers, early-stage diagnosis is associated with an increased overall survival 9,10 .
Although several studies have investigated factors associated with the early-stage diagnosis of cancers affecting oral cavity 10 , breast 9 , and ovary 11 , research on factors associated with lung cancer early diagnosis are still scarce in the literature. Thus, understanding factors associated with early-stage diagnosis is important for identifying possible interventions in the health system and community levels to improve diagnosis and, consequently, survival. We hypothesize that delayed diagnosis and treatment for lung cancer is associated with a series of socioeconomic barriers, thus requiring initiatives to improve the access of minoritized populations and reducing disparities.
This study aims to estimate overall survival for lung cancer and to identify factors associated with diagnosis of clinical stages I and II of lung cancer, including race, socioeconomic status, health insurance status, and education level.

METHODS
This is a retrospective cohort study conducted with individuals aged 18 years and older who were registered as lung cancer patients at the institutional cancer registry (CR) from January 2009 to December 2017. The study was approved by the Institutional Review Board of the Medical School of University of Massachusetts, under IRB ID: H00008342.
Being part of the University of Massachusetts Cancer Program, accredited by the Commission on Cancer of the American College of Surgeons, the institutional CR was created in 1999. Every year, CR data is directly submitted to the National Cancer Database (NCDB), meeting all NCDB timeliness and data quality criteria. The registry also meets all federal and state requirements, so that incidence rates of all cancer cases diagnosed and/or treated in our medical center are reported to the Massachusetts State Cancer Registry -awarded the North American Association of Central Cancer Registries (NAACCR) Gold Standard for quality, completeness, and timeliness 12 . From this body, cases are reported to the Centers for Disease Control (CDC).
The CR staff collected all the required information on cancer patients through manual record review, following the North American Association of Central Cancer Registries (NAACCR) requirements. Cases are identified by an electronic medical record interface (EPIC interface) that considers information from the pathology department and departments using diagnostic codes from the disease index. This workflow guarantees that the CR is notified of all malignant cases, thus ensuring compliance with federal and state reporting laws. Established by Congress through the Cancer Registries Amendment Act in 1992 and administered by CDC, the National Program of Cancer Registries (NPCR) collects data on cancer occurrence, including type, extent, and location, as well as on the type of initial treatment and outcomes. In the last five years, loss to follow-up was 9.38 %, and the compliance targets for Standard 6.5 was 90%. Every abstractor is subject to a quality review by a third-party company, and CR abstractors scores ranged from 96.78% to 99.52%, which is higher than the required rate of 92%.
Located in the city of Worcester, Central Massachusetts, the health system aims to promote culturally-sensitive excellence in clinical care, service, teaching, and research, thus improving the health of people from diverse communities of Central New England. Despite the high insurance coverage in the state of Massachusetts, disadvantaged patients such as those living in Worcester have more comorbidities and greater need for health education 13 .
This study primary outcome was diagnosis of clinical stage I and II (1, 1A, 1B, 2, 2A, and 2B) lung cancer, whereby the percentage of individuals diagnosed at an early stage was calculated. Clinical staging is based on any information on the extent of the cancer obtained before initiation of thel definitive treatment 14 .
The secondary outcome was overall survival (OS), defined as the length of time from diagnosis (the date the patient was first diagnosed with lung cancer, usually by imaging considered highly suspicious for malignancy or date of biopsy) to death from all causes in months. These outcomes were defined according to the National Cancer Database 14 and the US Census Bureau 15 .
Missing data accounted for less than 5%, which will reflect on the total number of individuals (N), as shown in Tables 1 and 3. Table 1 details other variables used in the study. For determining education and income level, the zip code of patient's residence reported at the time of diagnosis was used as a proxy. Data on zip codes were collected in the cancer registry and linked with the 2010 census data 15 . The cut-off point was defined as the median values for the state of Massachusetts, so that zip codes referring to median-income geographic locations in the lower bracket characterized patients with income and education levels lower than the state median. Patient's distance from the healthcare center were calculated using an algorithm was developed using Google Maps.
Survival rates were analyzed considering the (I) adjusted survival analysis (Kaplan-Meier) for the overall cohort; the (II) adjusted survival analysis (Kaplan-Meier) according to stage at diagnosis; and (III) stratified cox proportional hazard models. Survival models were adjusted by age, gender, race, Hispanic origin, education level, income level, distance to healthcare services, smoking status, comorbidities, health insurance, histology, primary site of lesion, laterality, and year of diagnosis.
Factors associated with diagnosis of clinical stage I and II lung cancer (primary outcome) were identified by bivariate and multivariate backward stepwise logistic regression (Log-likelihood ratio statistic (LR)) at 95%CI. Statistical analyses were performed using the Statistical Package for the Social Sciences (SPSS) version 22.

RESULTS
The cancer registry recorded 2,730 lung cancer patients from January 2009 to December 2017. All patients were followed until August 31 st , 2018, and five were lost to follow-up. Most patients were female (51.5%) and white (94.6%), with average age of 67.9 years and median age of 68.0 years (IQR 60-76); 6.6% of them never smoked. Among the study cohort, 3% lived in areas with income level higher than the state median and 37% in areas with education level higher the state median (Table 1). The five-year overall survival for the study cohort was 21%, approaching 43% for patients diagnosed with early-stage lung cancer (Stage I and II) and dropping to 8% among individuals diagnosed at more advanced stages (Stage III and IV) (Figure). The median survival rate was a little over one year (12.7 months) for the overall cohort, reaching 49 months for diagnosed at an early stage and dropping to 9 months among those diagnosed at late stages.   p-value = 0.071 in the bivariate analysis); however, such association was not maintained in the final cox regression model.
As stage at diagnosis was the most significant variable for a greater survival rate, we investigated factors associated with a higher likelihood of being diagnosed at stage I and II disease. Among the 2,598 cancer patients, 902 (35%) were diagnosed with early-stage lung cancer (   (Table 4).

DISCUSSION
Our results indicate that stage at diagnosis was the primary factor associated with lung cancer survival. Even after adjusting for other variables, diagnosis at stages I and II was associated with a higher survival rate when compared to late-stage diagnosis (III and IV).
Corroborating the literature on the theme 16,17 , we found worse survival rates among older individuals. Moreover, male individuals showed poor overall survival when compared to females, also in line with other studies findings 18,19 . Individuals who never smoked presented increased survival at all stages diagnosis, which reiterates the fact that heavy smokers often have an unhealthy lifestyle, worsening their overall survival 20 .
We verified an association between increased survival of lung cancer patients and diagnosis after 2014, which is coherent with the advent of new therapies and techniques 21 . Stage at diagnosis and treatment are more important predictors of survival than race, suggesting that racial disparities in lung cancer survival may disappear provided that early detection efforts benefit both Black and white individuals in the same way 22 . This can be verified, for example, by the reduction in racial disparities in timely cancer treatment arising from the expansion of the Medicaid insurance health coverage 23 .
Over the last 10 years, lung cancer incidence rate has been decreasing at a 2.3% rate yearly, while that of death decreased at approximately 2.9% 3 . The percentage of early-stage diagnosis is increasing over time, which might indicate a progress on lung cancer awareness and a reflection of the compliance with screening recommendations. However, different populations still face great disparities in diagnosis and treatment [23][24][25] , suggesting a multifactorial problem. The ethnic and racial differences surrounding cancer care are complex, extending beyond access to healthcare -a broad term that includes not only the means to visit medical providers, but also the possibility of doing so timely. Thus, cancer care includes both individual components as well as health policies regulating care 7 .
Identifying factors associated with early-stage diagnosis allow us to define intervention points within the health system that could improve access and quality of care, thus increasing overall survival.
In this study, we verified considerable race and gender disparities in early-stage lung cancer diagnosis. Several factors associated with early diagnosis -female gender, older age, and white race -have likewise been correlated with greater access to care and increased survival in other studies [26][27][28][29] . Besides racial disparities reflected on the finding that Black patients are often diagnosed with lung cancer at later stages, these individuals also have lower stage-specific survival for most cancer types. When compared to white patients, the relative risk of death among Black individuals is 33% higher, even after adjusting for gender, age, and stage at diagnosis 2 . The gender differences in healthcare are multifactorial and influenced by structural, psychosocial, and behavioral determinants of health 19,22,23 . Studies show that women tend to seek more healthcare regarding mental and physical issues than men 30 , which might explain why female individuals accounted for a greater percentage of patients diagnosed with early-stage lung cancer in our study.
An important aspect of this study is that the cohort does not reflect the environment where the institution is located. Located in a diverse country, Worcester is likewise a relatively diverse city, whose population consists of 57.1% white individuals, 20.9% Hispanic or Latinos, 11.8% African-American, and 7.29% Asians 24 . However, our study cohort consisted of 94.6% white individuals, 2.3% African-Americans, and 3.2% other races. White individuals are almost twice as likely to be diagnosed with lunger cancer at an early stage. A recent study showed that Medicaid expansion as part of the Affordable Care Act (ACA) reduced racial disparities in access to care 23 . This finding indicates that strategies, policies, and programs must use the health system as an instrument to reduce structural disparities surrounding proper and timely access to healthcare.
This study did not intended to capture all predictors of access to diagnosis, for our cohort consists of individuals that have already accessed the healthcare system. Thus, further studies should adopt a community level approach to determine the barriers of access to healthcare.
This study has some limitations as to the census variables used to estimate income and education level, given that the zip code served as reference for analysis. For example, 'High-income' refers to areas whose median income was greater than the median income of the city of Massachusetts. However, individuals median income may be higher than that of the zip code in which they resides, so that their access to healthcare services may be different than the overall access of the population from that particular zip code area.
Despite being a single-center study, our study findings echo those from the literature, providing evidence of the important disparities in lung cancer diagnosis and treatment and highlighting the need for addressing to provide a more equitable access to health.

CONCLUSIONS
Stage at diagnosis was the most decisive factor for lung cancer survival. Non-white and male individuals were more likely of being diagnosed at late stages. This study providing information for the population attended at this institution, besides outlining the possible pathways to reduce inequities.
In a continuous effort to improve early diagnosis and equitable access to healthcare, further studies are needed to identify the barriers to access to lung cancer diagnosis and treatment at the community level, thus helping to reduce mortality and enhance overall survival.